Morpheme Graph Construction for Speech and Natural Language
Abstract
This paper describes a method for morphological analysis of continuous spoken Korean that addresses the problem of integrating speech recognition with natural language processing. The method centers on Viterbi search-based morphological analysis built on top of speech signal processing and MLP-based phone recognition. The main contribution is a Viterbi search-based morphological analysis technique for speech processing in agglutinative languages. Across several experiments, we obtained an average continuous morpheme recognition rate of 84.4% in a morpheme graph built directly from phone recognition output with an average accuracy of 75.9%.
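The abstract does not spell out the search procedure, so the following is only a minimal sketch of the general idea: build a morpheme graph by matching lexicon entries against a recognized phone sequence, then run a Viterbi-style dynamic program to pick the best-scoring path. The lexicon, its romanized entries, the log-probabilities, and the function names `build_morpheme_graph` and `viterbi_best_path` are all assumptions made for illustration. Exact phone matching is also an assumption; it ignores the recognition errors a real system would have to tolerate.

```python
import math

# Hypothetical toy lexicon: morpheme -> (phone sequence, log-probability).
# Entries and scores are invented for illustration only.
LEXICON = {
    "hak": (("h", "a", "k"), math.log(0.2)),
    "kyo": (("k", "y", "o"), math.log(0.2)),
    "hakkyo": (("h", "a", "k", "k", "y", "o"), math.log(0.5)),
    "e": (("e",), math.log(0.1)),
}


def build_morpheme_graph(phones):
    """Return edges (start, end, morpheme, score) for every exact lexicon match."""
    edges = []
    for start in range(len(phones)):
        for morpheme, (pron, score) in LEXICON.items():
            end = start + len(pron)
            if tuple(phones[start:end]) == pron:
                edges.append((start, end, morpheme, score))
    return edges


def viterbi_best_path(phones):
    """Find the highest-scoring morpheme sequence that covers all phones."""
    n = len(phones)
    best = [-math.inf] * (n + 1)  # best[i]: best log-score of a path ending at i
    back = [None] * (n + 1)       # back[i]: (previous position, morpheme)
    best[0] = 0.0
    # Every edge moves forward, so visiting edges in order of start position
    # guarantees best[start] is final before it is extended.
    for start, end, morpheme, score in sorted(build_morpheme_graph(phones)):
        if best[start] + score > best[end]:
            best[end] = best[start] + score
            back[end] = (start, morpheme)
    if n > 0 and back[n] is None:
        return None  # no path covers the whole phone sequence
    path, pos = [], n
    while pos > 0:
        pos, morpheme = back[pos]
        path.append(morpheme)
    return list(reversed(path))
```

With this toy lexicon, `viterbi_best_path(list("hakkyoe"))` returns `["hakkyo", "e"]`, because the single long morpheme scores higher than the split `hak` + `kyo`. In a realistic setting the edge scores would presumably combine phone recognizer confidences with morphological connectivity information rather than fixed lexicon priors.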
Similar resources
A Viterbi-based morphological analysis for speech and natural language integration
This paper presents a statistical/symbolic hybrid morphological analysis, called V-morph, for large-scale speech and natural language integration for Korean. In the V-morph approach, statistical Viterbi-based lexical decoding and symbolic morphological modeling are integrated on top of a connectionist phoneme recognition engine. Linguistic characteristics of Korean are appropriately cons...
A Morpheme-based Part-of-Speech Tagger for Chinese
This paper presents a morpheme-based part-of-speech tagger for Chinese. It consists of two main components, namely a morpheme segmenter to segment each word in a sentence into a sequence of morphemes, based on forward maximum matching, and a lexical tagger to label each morpheme with a proper tag indicating its position pattern in forming a word of a specific class, based on lexicalized hidden ...
Towards better language modeling for Thai LVCSR
One of the difficulties of Thai language modeling is the process of text corpus preparation. Because there is no explicit word boundary marker in written Thai text, word segmentation must be performed prior to training a language model. This paper presents two approaches to language model construction for Thai LVCSR based on pseudo-morpheme merging. The first approach merges pseudo-morphemes us...
Use of high-level linguistic constraints for constructing feature-based phonological model in speech recognition
Modeling phonological units of speech is a critical issue in speech recognition. In this paper, we report our recent development of an overlapping feature-based phonological model which gives long-span contextual dependency. We extend our earlier work by incorporating high-level linguistic constraints in automatic construction of the feature overlapping patterns. The main linguistic information ...
Modeling Cross-morpheme Pro for Korean Large Vocabulary Cont
In this paper, we describe a cross-morpheme pronunciation variation model which is especially useful for constructing morpheme-based pronunciation lexicon for Korean LVCSR. There are a lot of pronunciation variations occurring at morpheme boundaries in continuous speech. Since phonemic context together with morphological category and morpheme boundary information affect Korean pronunciation var...